IXIR: A statistical information distillation system

Authors

  • Michael Levit
  • Dilek Z. Hakkani-Tür
  • Gökhan Tür
  • Daniel Gillick
Abstract

The task of information distillation is to extract snippets from massive multilingual audio and textual document sources that are relevant for a given templated query. We present an approach that focuses on the sentence extraction phase of the distillation process. It selects document sentences with respect to their relevance to a query via statistical classification with support vector machines. The distinguishing contribution of the approach is a novel method to generate classification features. The features are extracted from charts, compilations of elements from various annotation layers, such as word transcriptions, syntactic and semantic parses, and information extraction (IE) annotations. We describe a procedure for creating charts from documents and queries, while paying special attention to query slots (free-text descriptions of names, organizations, topic, events and so on, around which templates are centered), and suggest various types of classification features that can be extracted from these charts. While observing a 30% relative improvement due to non-lexical annotation layers, we perform a detailed analysis of the contributions of each of these layers to classification performance. © 2009 Elsevier Ltd. All rights reserved.
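The idea of deriving classification features from charts can be illustrated with a minimal sketch. It is not the authors' feature set: the chart representation (a dict mapping a layer name to a list of labels), the function names `ngrams` and `chart_features`, and the n-gram overlap measure are all assumptions made for illustration. The sketch computes, for each annotation layer, the fraction of query-slot n-grams that also occur in a candidate sentence; a vector of such values could then be passed to a support vector machine, which is not shown here.

```python
from collections import Counter


def ngrams(items, n):
    """Return the n-grams of a sequence as a list of tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def chart_features(sentence_chart, slot_chart, max_n=2):
    """Per-layer n-gram overlap between a sentence chart and a slot chart.

    Both charts are hypothetical dicts: layer name -> list of labels.
    Each feature is the share of slot n-grams found in the sentence.
    """
    features = {}
    for layer, slot_items in slot_chart.items():
        sent_items = sentence_chart.get(layer, [])
        for n in range(1, max_n + 1):
            slot_counts = Counter(ngrams(slot_items, n))
            sent_counts = Counter(ngrams(sent_items, n))
            # Multiset intersection counts each shared n-gram once per match.
            shared = sum((slot_counts & sent_counts).values())
            total = sum(slot_counts.values())
            features[f"{layer}:{n}gram_overlap"] = shared / total if total else 0.0
    return features


# Illustrative data only: a word layer and an entity-type layer.
sentence = {
    "words": ["the", "president", "of", "france", "visited", "paris"],
    "entities": ["GPE", "GPE"],
}
slot = {
    "words": ["president", "visited", "paris"],
    "entities": ["PERSON", "GPE"],
}

features = chart_features(sentence, slot)
for name, value in sorted(features.items()):
    print(name, value)
```

Keeping one feature group per layer makes it straightforward to switch a layer on or off, which is the kind of comparison needed to measure how much each annotation layer contributes.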


Related articles

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. This is necessary due to the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...


System Identification of a Steam Distillation Pilot-Scale Using ARX and NARX Approaches

This paper presents steam temperature models for a steam distillation pilot-scale (SDPS) system by comparing Pseudo Random Binary Sequence (PRBS) and Multi-Sine (M-Sine) perturbation signals. Both perturbation signals were applied to a nonlinear steam distillation system to study the capability of these input signals in exciting the nonlinearity of the system dynamics. In this work, both linear and nonlinear ARX...


Canonical Form and Separability of PPT States in C2 ⊗ CM ⊗ CN Composite Quantum Systems

Quantum entangled states have become one of the key resources in the rapidly expanding field of quantum information processing and computation [1, 2, 3, 4]. Nevertheless, the study of the physical character and mathematical structure of quantum entanglement is far from satisfactory. One does not even have a general criterion to judge whether a quantum (mixed) state is entangled or not. For bipart...


The geometry of separation processes: A horse-carrot theorem for steady flow systems

The horse-carrot theorem bounding the entropy production in processes with a fixed number of relaxations is extended to steady flow processes. The dissipation turns out to be related to a path of flows rather than states. The example of fractional distillation is presented and shows how null directions for the geometry turn out to be useful in the analysis. The implied distillation column des...


Model Predictive Inferential Control of a Distillation Column

Typical production objectives in distillation processes require the delivery of products whose compositions meet certain specifications. The distillation control system, therefore, must hold product compositions as near the set points as possible in the face of upsets. In this project, inferential model predictive control, which utilizes an artificial neural network estimator and model predictive cont...



Journal:
  • Computer Speech & Language

Volume 23

Publication date: 2009